Multilingual Retrieval in Twente
Abstract
The participation of the University of Twente in CLEF 2003 was very low profile. The perennial pressure on resources, familiar to researchers worldwide, shifted the focus of activities to other projects for some time. This paper therefore does not describe the techniques used to produce the Twente results, but instead gives an overview of current retrieval work at the University of Twente.
1 Current Work in Twente
The TKI research group has participated, and continues to participate, in several Dutch and European retrieval projects concerned with the disclosure of multimedia archives. Besides tackling the problem of processing the multimedia content itself, many of these projects use parallel textual content as an extra source of information. This content supports effective searching on information that cannot be derived from the multimedia content alone.
Within the Dutch project DRUID, a lot of work has been done on robust speech recognition for Dutch [2], video segmentation, and information extraction and filtering [3]. The IST project ECHO aimed at disclosing historic film materials; its major research themes were content-based processing of the video material and speech processing. Both projects had a strong focus on cross-lingual retrieval of the content, building on the results of previous projects.
The "Waterland project" (an ongoing Dutch project) does not directly concern multilingual retrieval, but investigates a flexible and generic architecture for the distribution of multimedia content.
In the Pidgin project, retrieval is not the major theme. However, cross-language aspects figure heavily in it: the resulting demonstrator should be able to generate automatic "translations" of user utterances, enabling conversations between people who speak different languages. The translations need not be grammatically perfect, but should be good enough to convey the correct meaning.
MUMIS, a recently completed IST project, aimed at disclosing video recordings of soccer matches. Parallel textual reports on those matches played a major role in this project. MUMIS also marked the start of TKI research on Information Extraction. The most interesting development was the automatic alignment and merging of information extracted from separate sources, which should improve the quality of retrieval [1].
In summary, the major part of the TKI research on retrieval focuses on multilingual access to multimedia content. In the future, more work will be done on Information Extraction for Dutch, and especially on the promising theme of cross-document information extraction: using information from one source to improve the extraction for another source.
1 http://dis.tpd.tno.nl/druid/
2 http://pc-erato2.iei.pi.cnr.it/echo/
3 http://www.pidgin.nl/
4 http://parlevink.cs.utwente.nl/projects/mumis/
Similar Resources
University of Twente at GeoCLEF 2006: Geofiltered Document Retrieval
In this report we describe the approach of the University of Twente to the 2006 GeoCLEF task. It is based on retrieval by content and the subsequent filtering by geographical relevance utilizing a gazetteer. The results do not show an improvement in retrieval performance when taking geographical information into account.
MARS: Multilingual Access and Retrieval System with Enhanced Query Translation and Document Retrieval
In this paper, we introduce a multilingual access and retrieval system with enhanced query translation and multilingual document retrieval, by mining bilingual terminologies and aligned documents directly from the set of comparable corpora which are to be searched upon by users. By extracting bilingual terminologies and aligning bilingual documents with similar content prior to the search proces...
Dublin City University at CLEF 2004: Experiments in Monolingual, Bilingual and Multilingual Retrieval
The Dublin City University group participated in the monolingual, bilingual and multilingual retrieval tasks this year. The main focus of our investigation this year was extending our retrieval system to document languages other than English, and completing the multilingual task comprising four languages: English, French, Russian and Finnish. Results from our French monolingual experiments indi...
Exploring the Effects of Language Skills on Multilingual Web Search
Multilingual access is an important area of research, especially given the growth in multilingual users of online resources. A large body of research exists for Cross-Language Information Retrieval (CLIR); however, little of this work has considered the language skills of the end user, a critical factor in providing effective multilingual search functionality. In this paper we describe an exper...
Multilingual Information Retrieval in World Wide Web
The article addresses: (1) the design of an information retrieval (IR) system, the Multilingual Information Retrieval Tool Hierarchy (MIRTH), which works with virtual corpora on the World Wide Web (also known as the Web or WWW). It is motivated by the desire to create a search engine to retrieve information by accessing a virtual. (2) The implementation of a general model of multilingual retrieval for the W...
Publication date: 2003